Introductory Statement

The Center for Social Organization of Schools has two primary objectives: to develop a scientific knowledge of how schools affect their students, and to use this knowledge to develop better school practices and organization.

The Center works through three programs to achieve its objectives. The Schools and Maturity program is studying the effects of school, family, and peer group experiences on the development of attitudes consistent with psychosocial maturity. The objectives are to formulate, assess, and research important educational goals other than traditional academic achievement. The School Organization program is currently concerned with authority-control structures, task structures, reward systems, and peer group processes in schools. The Careers program (formerly Careers and Curricula) bases its work upon a theory of career development. It has developed a self-administered vocational guidance device and a self-directed career program to promote vocational development and to foster satisfying curricular decisions for high school, college, and adult populations.

This report, prepared by the School Organization program, examines methods of assessing the effectiveness of schools and educational programs in promoting educational growth of students.



Abstract 

Artificial data were used to assess the correlation between several estimates of average student change in various schools and the "true" impact of those schools. Results indicate that all estimates involving pretest-posttest differences measure school impact with reasonable accuracy. It is important to measure change over the entire course of learning, however, and not just over the later stages of learning. The correlations between change scores and other school characteristics reflect with reasonable accuracy the relationships between those characteristics and impact, but will be large only when the underlying relationships are substantial. Simple gain scores measure the true situation about as accurately as other change estimates, are easier to compute, and probably are more meaningful to non-researchers.
Introduction

A basic purpose of education is to promote desirable change or growth in the educational attainment of students. It follows that schools or other educational programs should be evaluated largely on their effectiveness in promoting such change. There are many theoretical problems in estimating student change from scores on standard tests of educational attainment, however, and these problems are heightened in the typical situation where the students entering various schools differ systematically (Astin and Panos, 1971; Cronbach and Furby, 1970; Harris, 1963; Herriott and Muse, 1973; Klitgaard and Hall, 1973; O'Connor, 1972). It has been difficult to assess the practical importance of these theoretical problems because true change scores are unknown in most longitudinal research. Recently, a computer procedure was developed to provide artificial data in which these true change scores are known (Richards, Karweit, and Prevatt, in press). When such artificial data were used to compare several statistical techniques for assessing change in individual students (Richards, 1974), the results indicated that individual change is measured with reasonable accuracy by all techniques that involve the difference between the pretest and the posttest. In particular, the simple difference between the pretest and the posttest is about as accurate as other change estimates, such as regressed gain scores, and is much easier to compute than other estimates. These trends hold even when students are assigned nonrandomly to schools that differ in their impact on students.



These results strongly suggest that the theoretical problems of change measures have limited practical significance for measuring individual growth, and it is important to determine whether this is also the case for measuring school impact. Accordingly, in this study artificial data were used to assess the correlation between several estimates of average student change in various schools and the "true" impact of the same schools. This study is stated in the context of education, but the procedures for generating data and measuring change are abstract. Therefore, the results should generalize to many situations where one wishes to compare the impact of interventions.

Method

Simulation Procedure. Because it seems desirable for artificial data to resemble real data as closely as possible, the computer procedure was designed (Richards, et al., in press) to reproduce selected aspects of the ETS Growth Study (Hilton, Beaton, and Bower, 1971) and of the Project TALENT study of high schools in the United States (Flanagan, et al., 1962). In the ETS Growth Study, students were assessed initially with a measure of academic potential (SCAT) and a measure of educational attainment (STEP). Subject to the usual attrition in longitudinal research, the educational attainment of these students was reassessed on three subsequent occasions. Project TALENT provided intercorrelations among a variety of community, school, and student characteristics for a representative sample of U.S. high schools.

The computer procedure generates scores for individual students that strive to reproduce the means, standard deviations, and intercorrelations observed in the ETS Growth Study. The student's score on academic potential is generated first and used to derive that student's score on initial academic attainment. Then gain scores are generated and added to yield subsequent attainment scores. True standard scores are generated initially, then the appropriate amount of random error is added to each score and the scores are transformed to the metric of the ETS Growth Study observed scores. This simulation procedure closely reproduces the ETS Growth Study results (Richards, 1974).

The simulation procedure permits the investigator to assign students to schools either randomly or nonrandomly. When students are assigned nonrandomly, the program strives to reproduce the average correlation between community per capita income and average academic potential of students estimated from Project TALENT (.54). The ratio of between-schools variance to total variance also simulates the Project TALENT ratio.

The simulation procedure specifies that community per capita income determines school resources, and that school resources in turn determine school impact. A review of Project TALENT results suggested an average correlation of approximately .25 between community income and those school resources commonly assumed to facilitate student growth, so the simulation procedure strives to reproduce this relationship between income and resources. Community income is drawn randomly from a normal distribution, and it is assumed that school resources and school impact also are normally distributed.



There is little empirical basis for estimating either the correlation between resources and impact or the extent to which schools vary in impact. Therefore, the simulation procedure allows the investigator to specify both the correlation between resources and impact and the standard deviation of the impact variable. This standard deviation is specified in the form of a number between 0 and 1. When the standard deviation is .10, the average growth values used in generating scores are equal to the average growth scores obtained in the ETS study for a school with average impact, and are 10% higher than the ETS averages for a school one standard deviation above the mean on impact. (The simulated data appear to meet the assumptions for this manipulation even if the ETS data do not.)
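
The scaling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names, and the growth value of 10 points are hypothetical and do not come from the original program, and the rule is read here as a proportional adjustment of the average growth value by the school's standardized impact.

```python
def adjusted_mean_growth(mean_growth, impact_z, impact_sd=0.10):
    """Scale an average growth value by a school's impact.

    mean_growth: average growth for a school of average impact
                 (hypothetical value, not an ETS figure).
    impact_z:    the school's impact in standard deviation units.
    impact_sd:   the investigator-specified standard deviation of
                 impact, a number between 0 and 1 (.10 in this study).
    """
    return mean_growth * (1.0 + impact_sd * impact_z)


# A school of average impact keeps the average growth value; a school
# one standard deviation above the mean receives a value 10% higher.
average_school = adjusted_mean_growth(10.0, 0.0)
high_impact_school = adjusted_mean_growth(10.0, 1.0)
```

With a standard deviation of .10, a school two standard deviations below the mean on impact would receive growth values 20% below the average under this reading.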

Gain scores for individuals are generated according to the following principle:

$$G_t = G_m + G_d$$

where $G_t$ is total (true) growth, $G_m$ is average (or mean) growth (i.e., the parameter estimated from the ETS data), and $G_d$ is a deviation from this average that represents individual differences in true growth. The total gain score is added to the pretest score to yield the posttest score, and the posttest score then becomes the pretest for the next growth interval. For each growth interval, the pretest is one of the elements entering a multiple regression formula used to generate the $G_d$ values. The correlations between pretest and growth become increasingly negative for successive intervals (Richards, 1974).

In generating scores, the mean growth parameters for the three intervals are adjusted for school impact, and no other changes are made. Consequently, the adjusted mean growth parameters are not equal to the obtained average true growth scores for a given school. A school with above average impact will have higher than average mean growth parameters and therefore higher than average true posttest scores. These become higher than average true pretest scores for subsequent learning intervals, and these higher pretest scores make an increasingly negative contribution in the computation of subsequent true growth scores. The averages of the obtained true growth scores for that school will tend to be lower than the adjusted mean growth parameters. Similarly, the averages of the obtained true growth scores will tend to be higher than the adjusted mean growth parameters for a school with below average impact. Table 1 presents a simplified illustration of these trends for five hypothetical schools that are average in every respect except for differing in impact. Because other parameters besides pretest score are involved in generating scores (Richards, 1974), it is possible that a school with below average impact (and therefore below average adjusted mean growth parameters) will have higher average obtained true growth scores than a school with above average impact. This is especially true when students are assigned to schools nonrandomly.

Insert Table 1 About Here
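
The way a negative pretest contribution pulls obtained growth away from the adjusted growth parameters can be sketched as follows. Everything numerical here is an assumption made for illustration: the mean growth of 10 points per interval, the pretest weights of -.1, -.2, and -.3, and the function name are hypothetical, and the actual procedure uses a multiple regression with additional terms. The sketch shows only the direction of the effect, not the values in Table 1.

```python
def obtained_growth(impact_z, mean_growth=(10.0, 10.0, 10.0),
                    pretest_weights=(-0.1, -0.2, -0.3), impact_sd=0.10):
    """Return (adjusted parameter, obtained growth) for each interval.

    All default values are hypothetical. The weights grow more
    negative across intervals, mirroring the increasingly negative
    pretest-growth correlations described in the text.
    """
    pretest_deviation = 0.0  # the school starts at the average pretest
    results = []
    for mean, weight in zip(mean_growth, pretest_weights):
        adjusted = mean * (1.0 + impact_sd * impact_z)
        obtained = adjusted + weight * pretest_deviation
        results.append((adjusted, obtained))
        # This interval's posttest is the next interval's pretest, so
        # the school's lead (or lag) over an average school carries on.
        pretest_deviation += obtained - mean
    return results


above_average = obtained_growth(1.0)   # impact one SD above the mean
below_average = obtained_growth(-1.0)  # impact one SD below the mean
```

Under these assumed values the above average school obtains growth at or below its adjusted parameter in every interval, and the below average school obtains growth at or above its adjusted parameter, which is the pattern the text describes.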




Data Sets. Six sets of simulated data were generated for the present study. In each set, students were assigned to 100 schools or treatments. The number of students per school varied randomly with mean = 150 and standard deviation = 15. Therefore, the total number of students in each of these six sets was approximately 15,000.

In three of these sets students were assigned randomly to schools or treatments, and in the other three sets students were assigned nonrandomly. Under each type of assignment, simulated data were generated for three different assumptions about the relationship between school resources and school impact. Specifically, it was assumed that school resources account for 5%, 20%, or 80% of the variance in school impact (corresponding to correlations of .2236, .4472, or .8944). Finally, in all six sets the standard deviation of the impact variable was set at .10. At approximately this magnitude two simulated schools one standard deviation apart on impact (with N's = 150) will differ at the .05 level when compared with respect to educational growth between successive occasions.
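
The correspondence between the three variance assumptions and the stated correlations follows from the fact that a correlation is the square root of the proportion of variance accounted for. A short check, with variable names chosen only for this illustration:

```python
import math

# Proportions of variance in school impact attributed to resources.
variance_shares = [0.05, 0.20, 0.80]

# The corresponding correlation is the square root of each share.
correlations = [round(math.sqrt(share), 4) for share in variance_shares]

print(correlations)  # [0.2236, 0.4472, 0.8944]
```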



Change Measures. A wide variety of change measures have been proposed (Cronbach and Furby, 1970), but recent results suggest that most of these measures yield essentially equivalent results. Accordingly, this study used only four measures of change, each representing a different approach to estimating change. These change estimates included:

1. Posttest score.

2. Posttest score adjusted for initial academic potential. This change estimate is the difference between posttest score and predicted posttest score, using initial academic potential as the predictor. (The prediction equation for each data set was based on the observed relationships in that set.) Thus, this technique resembles analysis of covariance with academic potential treated as the covariate.

3. Raw gain. This change score is the simple difference between pretest score and posttest score.

4. Raw residual gain. This estimate is the difference between posttest score and predicted posttest score, using pretest score as the predictor.
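
The four change estimates listed above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function names and the four example values per variable are hypothetical, and the prediction equations are assumed to be ordinary least squares regressions with a single predictor, fitted within the data set at hand.

```python
def residuals(outcome, predictor):
    """Outcome minus its least squares prediction from one predictor."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predictor, outcome))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(predictor, outcome)]


def change_estimates(potential, pretest, posttest):
    """Return the four change estimates for parallel lists of scores."""
    return {
        "posttest": list(posttest),
        "adjusted_posttest": residuals(posttest, potential),
        "raw_gain": [post - pre for pre, post in zip(pretest, posttest)],
        "residual_gain": residuals(posttest, pretest),
    }


# Hypothetical means for four schools (not data from the study).
potential = [48.0, 50.0, 52.0, 55.0]
pretest = [40.0, 42.0, 45.0, 47.0]
posttest = [50.0, 53.0, 54.0, 60.0]
estimates = change_estimates(potential, pretest, posttest)
```

The same functions apply whether the lists hold scores for individual students or means for schools, which is the distinction examined in the first analysis reported below.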



Results 



To facilitate comparison with the earlier study of individual change estimates (Richards, 1974), the first step in the data analysis was to compute the correlations between average estimated change scores for various schools and average true change scores for the same schools. An unresolved question is whether it is better to compute change scores for individual students and then average them, or to compute change scores from school means (Dyer, Linn, and Patton, 1969), so both procedures were used to estimate change in this analysis. Table 2 summarizes the results.

Insert Table 2 About Here

These results seem quite consistent with the results of the earlier study of individual change estimates (Richards, 1974). Change is estimated most accurately by techniques that involve the difference between the pretest and the posttest, and these techniques seem equally accurate (i.e., raw gain is just as accurate as residual gain). For the most part, there is little difference between change estimates based on individual students and change estimates based on school means. In a few cases estimates based on school means have a clear advantage and these estimates are also easier to compute, so subsequent analyses in this paper involve only estimates based on school means.

The next analysis evaluated the accuracy of these change estimates as measures of school impact. Table 3 summarizes the correlations between impact and various change estimates. For comparative purposes, this table also summarizes the correlations between impact and average true growth scores.

Insert Table 3 About Here



These results indicate that change estimates can be quite effective in rank ordering schools with respect to their impact even when students are assigned to schools nonrandomly. The simple gain scores again were just as accurate as the residual gain scores and, as Cronbach and Furby (1970) point out, posttest score measures impact adequately when students are assigned to treatments randomly.

The results also indicate that it is important to measure change over an appropriate interval. Adjusted posttest scores, simple gain scores, and regressed gain scores all rank ordered schools accurately when they involved change from initial status, but none of the measures were particularly effective in rank ordering schools when they involved growth in the later stages of the learning process. This ineffectiveness reflected the true situation, because it is also characteristic of the true growth scores. The ETS data resemble other longitudinal or learning data in a number of respects (Richards, 1974), so these findings about when to measure change should have considerable generalizability.

The final question examined in this study involves the relationships among these change measures and the school characteristics that cause variations in impact. Such results are more typical of what would be obtained in a "real" longitudinal study. Table 4 summarizes the relevant correlations between resources and change. The magnitudes of these correlations clearly follow the underlying relationships between resources and impact, but are somewhat lower. The smaller magnitude of these correlations perhaps is partly the consequence of unreliability of the change scores, but also appears to reflect the imperfect correspondence between school impact and average true change. The results again indicate that raw gain is about as accurate as any other change estimate, reemphasize the importance of measuring change over an appropriate interval, and suggest that the correlation between a school characteristic and school impact must be reasonably substantial before any change score will reveal the relationship.

Insert Table 4 About Here



Discussion

Theoretical treatments of the issues considered in this paper have emphasized the theoretical difficulties of using change scores in general and of using simple gain scores in particular. The results of this study, like those of the earlier study of individual change (Richards, 1974), suggest that the practical importance of these theoretical difficulties may have been exaggerated. It appears that change estimates over an appropriate interval (e.g., the entire course of learning, not just the later stages) do measure school impact with reasonable accuracy. The correlations between change scores and other school characteristics reflect with reasonable accuracy the relationships between the same characteristics and school impact, but consequently will be large (or "significant") only when the underlying relationship is fairly substantial. These conclusions appear relatively unaffected by random vs. nonrandom assignment of students (although this finding could change for more severe nonrandomness), or by whether change measures involve individual scores or school means.

Insensitivity to weak relationships almost certainly is characteristic not just of change scores, but of all statistical procedures that might be applied to these data, and simple gain scores appear to reflect the true situation about as accurately as any other estimate of change or impact. Simple gain scores also are easier to compute than most other estimates and probably are more meaningful to non-researchers. Therefore, the results of this study suggest that it often may be quite appropriate to compare educational programs on the basis of simple pretest-posttest differences.*

* It should be emphasized that these conclusions apply to true longitudinal designs and this study should not be used to justify such procedures as measuring impact by educational attainment adjusted for a test of academic potential administered at the same time.

The discrepancy between this study and earlier theoretical treatments may perhaps best be resolved in terms of degree of concern about "Type I" errors. That is, theoretical treatments usually seem to assume that educational treatments do not differ on impact and emphasize the possibility that use of change scores, particularly simple gain scores, will lead to the false conclusion that they do differ. Certainly this possibility cannot be ignored, especially when the students assigned to various treatments differ considerably (Astin and Panos, 1971; Cronbach and Furby, 1970), and certainly it is possible to propose hypothetical situations where change scores could be misleading or confusing, especially if one has a taste for paradoxes (Lord, 1967). This study, on the other hand, assumed that schools do differ on impact and asked how accurately change scores describe these differences. The answer to this question appears much more favorable to change scores. Indeed, the results suggest that when one uses change scores over an inappropriate interval in a correlational study there may be a greater danger of the false conclusion that schools do not differ with respect to impact than of the false conclusion that schools do differ.

Cronbach and Furby (1970) correctly point out that some of the questions to which change scores might be applied could be answered directly with such techniques as partial correlation. The advantages of such techniques are that they are more direct than change scores, however, not that they are more accurate, nor that they require less statistical sophistication. The results of this study lend support to the investigator who prefers to use change scores for reasons of convenience or ease of understanding.

Finally, the results of this study again illustrate the usefulness of simulation techniques for investigations of longitudinal methodology. It would be impossible to investigate the questions considered in this study with "real" longitudinal data because the investigator would have no way of knowing either the true individual growth scores or the true school impact scores. At best one could compute the intercorrelations among different estimates of change (Dyer, et al., 1969). With simulated data it was easy to compute the correlations between true scores and the different estimated scores. It would also be easy to extend the simulation procedures to the situation where considerable attrition of subjects occurs, to the situation where one has only pseudo-longitudinal data (e.g., test scores for Occasions 1 and 2 obtained from different groups of students in the same school), or to different models for growth. Thus, simulation techniques offer considerable promise for refining our knowledge about when various procedures for analyzing longitudinal data are appropriate.